Cancer is an heterogeneous disease caused by the continuous accumulation of somatic mutations during lifetime of an individual. Somatic mutations that affect the cells are divided in:


Order of restrictions are encoded in DAGs
Mutations occur only if the preceding parent mutations have occurred (monotonicity)
Genotypes can follow different paths during disease progression
Evolutionary tumor progression models allows deviations from monotonicity

|
|
|
The order in which somatic mutations are acquired influence clonal evolution
Therefore, the fitness a mutations depends in which mutations were acquired previously
|
|
|
Cancer progression is driven by the accumulation of somatic mutations that interact epistatically
This is, their effect is non-additive to the tumor fitness as a phenotype


Coexistance of multiple clones in time
Evolutionary game theory
Fitness is actually a function of the relative frequency of other clones

PathTiMex is a probabilistic model of tumor progression among mutually exclusive driver pathways. This model was used to infer cancer progression in colorectal cancer.

This generative model is mapped into an evolutionary model, where deviations from monotonicity are allowed.
## Restriction table (extended version of the poset)
colcancer <- data.frame(
parent = c(rep("Root",3), "A", "B", "C"), ## Parent nodes
child = c("A", "B", "D", "C", "E", "E"), ## Child nodes
s = c(0.5, 0.2, 0.05, 0.1, rep(0.05, 2)), ## Restrictions are satisfied
sh = -0.3, ## Deviations from monotonicity (penalization)
typeDep = "MN" ## Type of dependency
)
## Fitness specification of the poset
colcancer_efec <- allFitnessEffects(
colcancer, # Poset
geneToModule = c( ## Specification of the modules
"Root" = "Root",
"A" = "APC",
"B" = "TP53, EVC2",
"C" = "KRAS",
"D" = "PI3KCA, EPHA",
"E" = "FBXW7, TCF7L2"),
drvNames = c( ## Specification of drivers
"APC", "TP53", "EVC2", "KRAS",
"PI3KCA", "EPHA", "FBXW7", "TCF7L2"))
## DAG representation
plot(colcancer_efec, expandModules = TRUE, autofit = TRUE, lwdf = 2)
# Evaluation of all genotypes and its fitness
colcancer_efec_FL <- evalAllGenotypes(colcancer_efec, max = 110000)
## Plot of fitness landscape
plotFitnessLandscape(colcancer_efec_FL)
Simplified model with 4 genes: APC, TP53, FBXW7 and KRAS.
# DAG from the simplified model
plot(cc_visuali, expandModules = TRUE, autofit = TRUE, lwdf = 2)
Mutation's order is inferred with pathTiMEx through the waiting time rate parameter. It informs about the mutation's arise. Fitness value associated to each order is derived from this parameter.
| Gene/module | Waiting time rate parameter |
|---|---|
| APC | 9.5 |
| KRAS | 2.89 |
| TP53, EVC2 | 1.92 |
| PIK3CA, EPHA3 | 0.17 |
| FBXW7, TCF7L2 | 0.08 |
The order in mutations: APC > TP53 > KRAS > FBXW7 is consistent with restrictions and is associated to the highest fitness value. Different order deviates from monotoniciy and lower fitness is associated.
## Order effects
cc_order <- allFitnessEffects(
orderEffects = c("A > B > C > D" = 0.5,
"B > A > C > D" = 0.2,
"B > C > A > D" = 0.1,
"B > C > D > A" = 0.05,
"A > C" = 0.2,
"C > A" = 0.05,
"D > A" = 0.05,
"A > D" = 0.2,
"B > D" = 0.2,
"C > D" = 0.2,
"B > C" = 0.2,
"C > B" = 0.1,
"B > A" = 0.1,
"A > B" = 0.3),
geneToModule =
c("A" = "APC",
"B" = "KRAS",
"C" = "TP53",
"D" = "FBXW7") )
## Fitness associated to each genotype
(cc_order_geno <- evalAllGenotypes(cc_order, order = TRUE))
| Genotype | Fitness |
|---|---|
| APC | 1,0000 |
| FBXW7 | 1,0000 |
| KRAS | 1,0000 |
| TP53 | 1,0000 |
| APC > FBXW7 | 1,2000 |
| APC > KRAS | 1,3000 |
| APC > TP53 | 1,2000 |
| FBXW7 > APC | 1,0500 |
| FBXW7 > KRAS | 1,0000 |
| FBXW7 > TP53 | 1,0000 |
| KRAS > APC | 1,1000 |
| KRAS > FBXW7 | 1,2000 |
| KRAS > TP53 | 1,2000 |
| TP53 > APC | 1,0500 |
| TP53 > FBXW7 | 1,2000 |
| TP53 > KRAS | 1,1000 |
| APC > FBXW7 > KRAS | 1,5600 |
| APC > FBXW7 > TP53 | 1,4400 |
| APC > KRAS > FBXW7 | 1,8720 |
| APC > KRAS > TP53 | 1,8720 |
| APC > TP53 > FBXW7 | 1,7280 |
| APC > TP53 > KRAS | 1,7160 |
| FBXW7 > APC > KRAS | 1,3650 |
| FBXW7 > APC > TP53 | 1,2600 |
| FBXW7 > KRAS > APC | 1,1550 |
| FBXW7 > KRAS > TP53 | 1,2000 |
| FBXW7 > TP53 > APC | 1,1025 |
| FBXW7 > TP53 > KRAS | 1,1000 |
| KRAS > APC > FBXW7 | 1,5840 |
| KRAS > APC > TP53 | 1,5840 |
| ... | ... |
| TP53 > APC > FBXW7 | 1,512000 |
| TP53 > APC > KRAS | 1,501500 |
| TP53 > FBXW7 > APC | 1,323000 |
| TP53 > FBXW7 > KRAS | 1,320000 |
| TP53 > KRAS > APC | 1,270500 |
| TP53 > KRAS > FBXW7 | 1,584000 |
| APC > FBXW7 > KRAS > TP53 | 2,246400 |
| APC > FBXW7 > TP53 > KRAS | 2,059200 |
| APC > KRAS > FBXW7 > TP53 | 2,695680 |
| APC > KRAS > TP53 > FBXW7 | 4,852224 |
| APC > TP53 > FBXW7 > KRAS | 2,471040 |
| APC > TP53 > KRAS > FBXW7 | 2,965248 |
| FBXW7 > APC > KRAS > TP53 | 1,965600 |
| FBXW7 > APC > TP53 > KRAS | 1,801800 |
| FBXW7 > KRAS > APC > TP53 | 1,663200 |
| FBXW7 > KRAS > TP53 > APC | 1,455300 |
| FBXW7 > TP53 > APC > KRAS | 1,576575 |
| FBXW7 > TP53 > KRAS > APC | 1,334025 |
| KRAS > APC > FBXW7 > TP53 | 2,280960 |
| KRAS > APC > TP53 > FBXW7 | 3,284582 |
| KRAS > FBXW7 > APC > TP53 | 1,995840 |
| KRAS > FBXW7 > TP53 > APC | 1,746360 |
| KRAS > TP53 > APC > FBXW7 | 2,634509 |
| KRAS > TP53 > FBXW7 > APC | 2,200414 |
| TP53 > APC > FBXW7 > KRAS | 2,162160 |
| TP53 > APC > KRAS > FBXW7 | 2,594592 |
| TP53 > FBXW7 > APC > KRAS | 1,891890 |
| TP53 > FBXW7 > KRAS > APC | 1,600830 |
| TP53 > KRAS > APC > FBXW7 | 2,195424 |
| TP53 > KRAS > FBXW7 > APC | 1,920996 |
#DAG
plot(cc_order)
Error in `*tmp*`[[i]]: subíndice fuera de los límites Traceback: 1. plot(cc_order) 2. plot.fitnessEffects(cc_order) 3. plot(g1, edgeAttrs = list(arrowsize = a1, lty = s1, lwd = lwd, . color = c1), nodeAttrs = nAttrs) 4. plot(g1, edgeAttrs = list(arrowsize = a1, lty = s1, lwd = lwd, . color = c1), nodeAttrs = nAttrs) 5. .local(x, y, ...) 6. agopen(x, name = name, layout = TRUE, layoutType = y, attrs = attrs, . nodeAttrs = nodeAttrs, edgeAttrs = edgeAttrs, subGList = subGList, . recipEdges = recipEdges)
# Fitness landscape
plotFitnessLandscape(cc_order_geno)
Error in to_Fitness_Matrix(x, max_num_genotypes = max_num_genotypes): We cannot deal with order effects yet.
Traceback:
1. plotFitnessLandscape(cc_order_geno)
2. to_Fitness_Matrix(x, max_num_genotypes = max_num_genotypes)
3. stop("We cannot deal with order effects yet.")
Generative probabilistic model
The order in which mutations arise are better described at the pathway level instead of a gene level
Model mapped the progression model from colorectal cancer data

## Define poset restrictions, mapping of genes to modules, and driver genes
(CRC_W <- allFitnessEffects(data.frame(parent = c("Root", "A", "B", "C"),
child = c("A", "B", "C", "D"),
s = c(0.6, 0.4, 0.1, 0.05),
sh = -0.5,
typeDep = "MN"),
geneToModule = c("Root" = "Root",
"A" = "APC, EPHA3, TCF7L2",
"B" = "EVC2, PIK3CA, TP53",
"C" = "KRAS",
"D" = "FBXW7"),
drvNames = c("APC", "EPHA3", "TCF7L2", "EVC2", "PIK3CA",
"TP53", "KRAS", "FBXW7")))
$long.rt
$long.rt$A
$long.rt$A$child
[1] "A"
$long.rt$A$s
[1] 0.6
$long.rt$A$sh
[1] -0.5
$long.rt$A$typeDep
MN
"monotone"
$long.rt$A$parents
[1] "Root"
$long.rt$A$childNumID
A
1
$long.rt$A$parentsNumID
Root
0
$long.rt$B
$long.rt$B$child
[1] "B"
$long.rt$B$s
[1] 0.4
$long.rt$B$sh
[1] -0.5
$long.rt$B$typeDep
MN
"monotone"
$long.rt$B$parents
[1] "A"
$long.rt$B$childNumID
B
2
$long.rt$B$parentsNumID
A
1
$long.rt$C
$long.rt$C$child
[1] "C"
$long.rt$C$s
[1] 0.1
$long.rt$C$sh
[1] -0.5
$long.rt$C$typeDep
MN
"monotone"
$long.rt$C$parents
[1] "B"
$long.rt$C$childNumID
C
3
$long.rt$C$parentsNumID
B
2
$long.rt$D
$long.rt$D$child
[1] "D"
$long.rt$D$s
[1] 0.05
$long.rt$D$sh
[1] -0.5
$long.rt$D$typeDep
MN
"monotone"
$long.rt$D$parents
[1] "C"
$long.rt$D$childNumID
D
4
$long.rt$D$parentsNumID
C
3
$long.epistasis
list()
$long.orderEffects
list()
$long.geneNoInt
data frame with 0 columns and 0 rows
$geneModule
Gene Module GeneNumID ModuleNumID
1 Root Root 0 0
2 APC A 1 1
3 EPHA3 A 2 1
4 EVC2 B 3 2
5 FBXW7 D 4 4
6 KRAS C 5 3
7 PIK3CA B 6 2
8 TCF7L2 A 7 1
9 TP53 B 8 2
$gMOneToOne
[1] FALSE
$geneToModule
Root A B
"Root" "APC, EPHA3, TCF7L2" "EVC2, PIK3CA, TP53"
C D
"KRAS" "FBXW7"
$graph
IGRAPH c356bc7 DN-- 5 4 --
+ attr: name (v/c), typeDep (e/c), color (e/c), lty (e/n), arrow.mode
| (e/n)
+ edges from c356bc7 (vertex names):
[1] Root->A A ->B B ->C C ->D
$drv
[1] 1 2 3 4 5 6 7 8
$rT
parent child s sh typeDep
1 Root A 0.60 -0.5 MN
2 A B 0.40 -0.5 MN
3 B C 0.10 -0.5 MN
4 C D 0.05 -0.5 MN
$epistasis
NULL
$orderEffects
NULL
$noIntGenes
NULL
$fitnessLandscape
[,1]
$fitnessLandscape_df
data frame with 0 columns and 0 rows
$fitnessLandscape_gene_id
data frame with 0 columns and 0 rows
$fitnessLandscapeVariables
character(0)
$frequencyDependentFitness
[1] FALSE
$frequencyType
[1] "freq_dep_not_used"
attr(,"class")
[1] "fitnessEffects"
# DAG representation
plot(CRC_W, expandModules = TRUE, autofit = TRUE, lwdf = 2)
## Map genotypes to fitness
CRC_F <- evalAllGenotypes(CRC_W, order = FALSE, addwt = TRUE)
## Plot of fitness landscape
plot(CRC_F)
## Simplified model
## Define poset restrictions, mapping of genes to modules, and driver genes
(CRC_W2 <- allFitnessEffects(data.frame(parent = c(rep("Root", 2), "A", "B", "C"),
child = c("A", "B", rep("C", 2), "D"),
s = c(0.2, 0.1, rep(0.05, 2), 0.01),
sh = -0.5,
typeDep = c(rep("XMPN", 4), "MN")),
geneToModule = c("Root" = "Root",
"A" = "APC",
"B" = "TP53",
"C" = "KRAS",
"D" = "FBXW7"),
drvNames = c("APC", "TP53", "KRAS", "FBXW7")))
$long.rt
$long.rt$A
$long.rt$A$child
[1] "A"
$long.rt$A$s
[1] 0.2
$long.rt$A$sh
[1] -0.5
$long.rt$A$typeDep
XMPN
"xmpn"
$long.rt$A$parents
[1] "Root"
$long.rt$A$childNumID
A
1
$long.rt$A$parentsNumID
Root
0
$long.rt$B
$long.rt$B$child
[1] "B"
$long.rt$B$s
[1] 0.1
$long.rt$B$sh
[1] -0.5
$long.rt$B$typeDep
XMPN
"xmpn"
$long.rt$B$parents
[1] "Root"
$long.rt$B$childNumID
B
2
$long.rt$B$parentsNumID
Root
0
$long.rt$C
$long.rt$C$child
[1] "C"
$long.rt$C$s
[1] 0.05
$long.rt$C$sh
[1] -0.5
$long.rt$C$typeDep
XMPN
"xmpn"
$long.rt$C$parents
[1] "A" "B"
$long.rt$C$childNumID
C
3
$long.rt$C$parentsNumID
A B
1 2
$long.rt$D
$long.rt$D$child
[1] "D"
$long.rt$D$s
[1] 0.01
$long.rt$D$sh
[1] -0.5
$long.rt$D$typeDep
MN
"monotone"
$long.rt$D$parents
[1] "C"
$long.rt$D$childNumID
D
4
$long.rt$D$parentsNumID
C
3
$long.epistasis
list()
$long.orderEffects
list()
$long.geneNoInt
data frame with 0 columns and 0 rows
$geneModule
Gene Module GeneNumID ModuleNumID
1 Root Root 0 0
2 APC A 1 1
3 FBXW7 D 2 4
4 KRAS C 3 3
5 TP53 B 4 2
$gMOneToOne
[1] FALSE
$geneToModule
Root A B C D
"Root" "APC" "TP53" "KRAS" "FBXW7"
$graph
IGRAPH 440feb2 DN-- 5 5 --
+ attr: name (v/c), typeDep (e/c), color (e/c), lty (e/n), arrow.mode
| (e/n)
+ edges from 440feb2 (vertex names):
[1] Root->A Root->B A ->C B ->C C ->D
$drv
[1] 1 2 3 4
$rT
parent child s sh typeDep
1 Root A 0.20 -0.5 XMPN
2 Root B 0.10 -0.5 XMPN
3 A C 0.05 -0.5 XMPN
4 B C 0.05 -0.5 XMPN
5 C D 0.01 -0.5 MN
$epistasis
NULL
$orderEffects
NULL
$noIntGenes
NULL
$fitnessLandscape
[,1]
$fitnessLandscape_df
data frame with 0 columns and 0 rows
$fitnessLandscape_gene_id
data frame with 0 columns and 0 rows
$fitnessLandscapeVariables
character(0)
$frequencyDependentFitness
[1] FALSE
$frequencyType
[1] "freq_dep_not_used"
attr(,"class")
[1] "fitnessEffects"
# DAG representation
plot(CRC_W2, expandModules = TRUE, autofit = TRUE, lwdf = 2)
## Simplified Model
## Map genotypes to fitness
(CRC_F2 <- evalAllGenotypes(CRC_W2, order = FALSE, addwt = TRUE))
| Genotype | Fitness |
|---|---|
| <chr> | <dbl> |
| WT | 1.00000 |
| APC | 1.20000 |
| FBXW7 | 0.50000 |
| KRAS | 0.50000 |
| TP53 | 1.10000 |
| APC, FBXW7 | 0.60000 |
| APC, KRAS | 1.26000 |
| APC, TP53 | 1.32000 |
| FBXW7, KRAS | 0.50500 |
| FBXW7, TP53 | 0.55000 |
| KRAS, TP53 | 1.15500 |
| APC, FBXW7, KRAS | 1.27260 |
| APC, FBXW7, TP53 | 0.66000 |
| APC, KRAS, TP53 | 0.66000 |
| FBXW7, KRAS, TP53 | 1.16655 |
| APC, FBXW7, KRAS, TP53 | 0.66660 |
## Plot of fitness landscape
plot(CRC_F2, use_ggrepel = TRUE)
Restrictions imposed with XOR
Fitness values that reflect slightly deleterious effect (if two genes appear)
or a highly deleterious effect (if three genes appear).
## Simplified model
## Define poset restrictions, mapping of genes to modules, and driver genes
(CRC_W4 <- allFitnessEffects(data.frame(parent = c(rep("Root", 2), "A", "B", "C"),
child = c("A", "B", rep("C", 2), "D"),
s = c(0.2, 0.1, rep(0.05, 2), 0.01),
sh = -0.5,
typeDep = c(rep("XMPN", 4), "MN")),
epistasis = c("A : -B : -C" = 0.2,
"-A : B : -C" = 0.1,
"-A : -B : C" = 0.05,
"A : B : -C" = 0.01,
"-A : B : C" = 0.02,
"-B : A : C" = 0.02,
"A : B : C" = -0.5),
geneToModule = c("Root" = "Root",
"A" = "APC",
"B" = "TP53",
"C" = "KRAS",
"D" = "FBXW7"),
drvNames = c("APC", "TP53", "KRAS", "FBXW7")))
$long.rt
$long.rt$A
$long.rt$A$child
[1] "A"
$long.rt$A$s
[1] 0.2
$long.rt$A$sh
[1] -0.5
$long.rt$A$typeDep
XMPN
"xmpn"
$long.rt$A$parents
[1] "Root"
$long.rt$A$childNumID
A
1
$long.rt$A$parentsNumID
Root
0
$long.rt$B
$long.rt$B$child
[1] "B"
$long.rt$B$s
[1] 0.1
$long.rt$B$sh
[1] -0.5
$long.rt$B$typeDep
XMPN
"xmpn"
$long.rt$B$parents
[1] "Root"
$long.rt$B$childNumID
B
2
$long.rt$B$parentsNumID
Root
0
$long.rt$C
$long.rt$C$child
[1] "C"
$long.rt$C$s
[1] 0.05
$long.rt$C$sh
[1] -0.5
$long.rt$C$typeDep
XMPN
"xmpn"
$long.rt$C$parents
[1] "A" "B"
$long.rt$C$childNumID
C
3
$long.rt$C$parentsNumID
A B
1 2
$long.rt$D
$long.rt$D$child
[1] "D"
$long.rt$D$s
[1] 0.01
$long.rt$D$sh
[1] -0.5
$long.rt$D$typeDep
MN
"monotone"
$long.rt$D$parents
[1] "C"
$long.rt$D$childNumID
D
4
$long.rt$D$parentsNumID
C
3
$long.epistasis
$long.epistasis[[1]]
$long.epistasis[[1]]$ids
[1] "-C" "-B" "A"
$long.epistasis[[1]]$s
[1] 0.2
$long.epistasis[[1]]$NumID
C B A
-3 -2 1
$long.epistasis[[2]]
$long.epistasis[[2]]$ids
[1] "-C" "-A" "B"
$long.epistasis[[2]]$s
[1] 0.1
$long.epistasis[[2]]$NumID
C A B
-3 -1 2
$long.epistasis[[3]]
$long.epistasis[[3]]$ids
[1] "-B" "-A" "C"
$long.epistasis[[3]]$s
[1] 0.05
$long.epistasis[[3]]$NumID
B A C
-2 -1 3
$long.epistasis[[4]]
$long.epistasis[[4]]$ids
[1] "-C" "A" "B"
$long.epistasis[[4]]$s
[1] 0.01
$long.epistasis[[4]]$NumID
C A B
-3 1 2
$long.epistasis[[5]]
$long.epistasis[[5]]$ids
[1] "-A" "B" "C"
$long.epistasis[[5]]$s
[1] 0.02
$long.epistasis[[5]]$NumID
A B C
-1 2 3
$long.epistasis[[6]]
$long.epistasis[[6]]$ids
[1] "-B" "A" "C"
$long.epistasis[[6]]$s
[1] 0.02
$long.epistasis[[6]]$NumID
B A C
-2 1 3
$long.epistasis[[7]]
$long.epistasis[[7]]$ids
[1] "A" "B" "C"
$long.epistasis[[7]]$s
[1] -0.5
$long.epistasis[[7]]$NumID
A B C
1 2 3
$long.orderEffects
list()
$long.geneNoInt
data frame with 0 columns and 0 rows
$geneModule
Gene Module GeneNumID ModuleNumID
1 Root Root 0 0
2 APC A 1 1
3 FBXW7 D 2 4
4 KRAS C 3 3
5 TP53 B 4 2
$gMOneToOne
[1] FALSE
$geneToModule
Root A B C D
"Root" "APC" "TP53" "KRAS" "FBXW7"
$graph
IGRAPH b304fee DN-- 5 8 --
+ attr: name (v/c), typeDep (e/c), color (e/c), lty (e/n), arrow.mode
| (e/n)
+ edges from b304fee (vertex names):
[1] Root->A Root->B A ->C B ->C C ->D A ->B A ->C B ->C
$drv
[1] 1 2 3 4
$rT
parent child s sh typeDep
1 Root A 0.20 -0.5 XMPN
2 Root B 0.10 -0.5 XMPN
3 A C 0.05 -0.5 XMPN
4 B C 0.05 -0.5 XMPN
5 C D 0.01 -0.5 MN
$epistasis
A : -B : -C -A : B : -C -A : -B : C A : B : -C -A : B : C -B : A : C
0.20 0.10 0.05 0.01 0.02 0.02
A : B : C
-0.50
$orderEffects
NULL
$noIntGenes
NULL
$fitnessLandscape
[,1]
$fitnessLandscape_df
data frame with 0 columns and 0 rows
$fitnessLandscape_gene_id
data frame with 0 columns and 0 rows
$fitnessLandscapeVariables
character(0)
$frequencyDependentFitness
[1] FALSE
$frequencyType
[1] "freq_dep_not_used"
attr(,"class")
[1] "fitnessEffects"
# DAG representation
plot(CRC_W4, expandModules = TRUE, autofit = TRUE, lwdf = 2)
## Map genotypes to fitness
CRC_F2 <- evalAllGenotypes(CRC_W4, order = FALSE, addwt = TRUE)
(CRC_F2)
| Genotype | Fitness |
|---|---|
| <chr> | <dbl> |
| WT | 1.000000 |
| APC | 1.440000 |
| FBXW7 | 0.500000 |
| KRAS | 0.525000 |
| TP53 | 1.210000 |
| APC, FBXW7 | 0.720000 |
| APC, KRAS | 1.285200 |
| APC, TP53 | 1.333200 |
| FBXW7, KRAS | 0.530250 |
| FBXW7, TP53 | 0.605000 |
| KRAS, TP53 | 1.178100 |
| APC, FBXW7, KRAS | 1.298052 |
| APC, FBXW7, TP53 | 0.666600 |
| APC, KRAS, TP53 | 0.330000 |
| FBXW7, KRAS, TP53 | 1.189881 |
| APC, FBXW7, KRAS, TP53 | 0.333300 |
## Plot of fitness landscape
plot(CRC_F2, use_ggrepel = TRUE)
Highly deleterious effects if APC, TP53, or KRAS appear independently
Slightly deleterious effects were set if two of those genes appear in a genotype
Benefitial effect for the genotype of the three genes
Restrictions imposed with SM
## Simplified model
## SM because synthetic viability requires both parent nodes.
## Define poset restrictions, mapping of genes to modules, and driver genes
(CRC_W6 <- allFitnessEffects(data.frame(parent = c(rep("Root", 2), "A", "B", "C"),
child = c("A", "B", rep("C", 2), "D"),
s = c(0.2, 0.1, rep(0.05, 2), 0.01),
sh = -0.5,
typeDep = c(rep("MN", 5))),
epistasis = c("A : -B : -C" = -0.2,
"-A : B : -C" = -0.2,
"-A : -B : C" = -0.3,
"A : B : -C" = -0.05,
"-A : B : C" = -0.01,
"A : -B : C" = -0.01,
"A : B : C" = 0.5),
geneToModule = c("Root" = "Root",
"A" = "APC",
"B" = "TP53",
"C" = "KRAS",
"D" = "FBXW7"),
drvNames = c("APC", "TP53", "KRAS", "FBXW7")))
$long.rt
$long.rt$A
$long.rt$A$child
[1] "A"
$long.rt$A$s
[1] 0.2
$long.rt$A$sh
[1] -0.5
$long.rt$A$typeDep
MN
"monotone"
$long.rt$A$parents
[1] "Root"
$long.rt$A$childNumID
A
1
$long.rt$A$parentsNumID
Root
0
$long.rt$B
$long.rt$B$child
[1] "B"
$long.rt$B$s
[1] 0.1
$long.rt$B$sh
[1] -0.5
$long.rt$B$typeDep
MN
"monotone"
$long.rt$B$parents
[1] "Root"
$long.rt$B$childNumID
B
2
$long.rt$B$parentsNumID
Root
0
$long.rt$C
$long.rt$C$child
[1] "C"
$long.rt$C$s
[1] 0.05
$long.rt$C$sh
[1] -0.5
$long.rt$C$typeDep
MN
"monotone"
$long.rt$C$parents
[1] "A" "B"
$long.rt$C$childNumID
C
3
$long.rt$C$parentsNumID
A B
1 2
$long.rt$D
$long.rt$D$child
[1] "D"
$long.rt$D$s
[1] 0.01
$long.rt$D$sh
[1] -0.5
$long.rt$D$typeDep
MN
"monotone"
$long.rt$D$parents
[1] "C"
$long.rt$D$childNumID
D
4
$long.rt$D$parentsNumID
C
3
$long.epistasis
$long.epistasis[[1]]
$long.epistasis[[1]]$ids
[1] "-C" "-B" "A"
$long.epistasis[[1]]$s
[1] -0.2
$long.epistasis[[1]]$NumID
C B A
-3 -2 1
$long.epistasis[[2]]
$long.epistasis[[2]]$ids
[1] "-C" "-A" "B"
$long.epistasis[[2]]$s
[1] -0.2
$long.epistasis[[2]]$NumID
C A B
-3 -1 2
$long.epistasis[[3]]
$long.epistasis[[3]]$ids
[1] "-B" "-A" "C"
$long.epistasis[[3]]$s
[1] -0.3
$long.epistasis[[3]]$NumID
B A C
-2 -1 3
$long.epistasis[[4]]
$long.epistasis[[4]]$ids
[1] "-C" "A" "B"
$long.epistasis[[4]]$s
[1] -0.05
$long.epistasis[[4]]$NumID
C A B
-3 1 2
$long.epistasis[[5]]
$long.epistasis[[5]]$ids
[1] "-A" "B" "C"
$long.epistasis[[5]]$s
[1] -0.01
$long.epistasis[[5]]$NumID
A B C
-1 2 3
$long.epistasis[[6]]
$long.epistasis[[6]]$ids
[1] "-B" "A" "C"
$long.epistasis[[6]]$s
[1] -0.01
$long.epistasis[[6]]$NumID
B A C
-2 1 3
$long.epistasis[[7]]
$long.epistasis[[7]]$ids
[1] "A" "B" "C"
$long.epistasis[[7]]$s
[1] 0.5
$long.epistasis[[7]]$NumID
A B C
1 2 3
$long.orderEffects
list()
$long.geneNoInt
data frame with 0 columns and 0 rows
$geneModule
Gene Module GeneNumID ModuleNumID
1 Root Root 0 0
2 APC A 1 1
3 FBXW7 D 2 4
4 KRAS C 3 3
5 TP53 B 4 2
$gMOneToOne
[1] FALSE
$geneToModule
Root A B C D
"Root" "APC" "TP53" "KRAS" "FBXW7"
$graph
IGRAPH f2a87fd DN-- 5 8 --
+ attr: name (v/c), typeDep (e/c), color (e/c), lty (e/n), arrow.mode
| (e/n)
+ edges from f2a87fd (vertex names):
[1] Root->A Root->B A ->C B ->C C ->D A ->B A ->C B ->C
$drv
[1] 1 2 3 4
$rT
parent child s sh typeDep
1 Root A 0.20 -0.5 MN
2 Root B 0.10 -0.5 MN
3 A C 0.05 -0.5 MN
4 B C 0.05 -0.5 MN
5 C D 0.01 -0.5 MN
$epistasis
A : -B : -C -A : B : -C -A : -B : C A : B : -C -A : B : C A : -B : C
-0.20 -0.20 -0.30 -0.05 -0.01 -0.01
A : B : C
0.50
$orderEffects
NULL
$noIntGenes
NULL
$fitnessLandscape
[,1]
$fitnessLandscape_df
data frame with 0 columns and 0 rows
$fitnessLandscape_gene_id
data frame with 0 columns and 0 rows
$fitnessLandscapeVariables
character(0)
$frequencyDependentFitness
[1] FALSE
$frequencyType
[1] "freq_dep_not_used"
attr(,"class")
[1] "fitnessEffects"
# DAG representation
plot(CRC_W6, expandModules = TRUE, autofit = TRUE, lwdf = 2)
## Map genotypes to fitness
CRC_F4 <- evalAllGenotypes(CRC_W6, order = FALSE, addwt = TRUE)
(CRC_F4)
| Genotype | Fitness |
|---|---|
| <chr> | <dbl> |
| WT | 1.000000 |
| APC | 0.960000 |
| FBXW7 | 0.500000 |
| KRAS | 0.350000 |
| TP53 | 0.880000 |
| APC, FBXW7 | 0.480000 |
| APC, KRAS | 0.594000 |
| APC, TP53 | 1.254000 |
| FBXW7, KRAS | 0.353500 |
| FBXW7, TP53 | 0.440000 |
| KRAS, TP53 | 0.544500 |
| APC, FBXW7, KRAS | 0.599940 |
| APC, FBXW7, TP53 | 0.627000 |
| APC, KRAS, TP53 | 2.079000 |
| FBXW7, KRAS, TP53 | 0.549945 |
| APC, FBXW7, KRAS, TP53 | 2.099790 |
## Plot of fitness landscape
plot(CRC_F4, use_ggrepel = TRUE)
This model assumes driver genes are over-represented among those mutated across a large tumor collection and, thus, they can be identified in terms of frequency. Also, those participating of the same pathway are mutated in a mutually exclusive manner because more than one mutation in a pathway does not give any selective advantage to the clone. analyze one large dataset of colorectal adenocarcinoma (COADREAD) from IntOGen-mutations database.
## Restriction table, including DAG of restrictions specifications and associated fitness
COADREAD_rT <- data.frame(
parent = c("Root", "A", "B", "C", "D", "E", "F"), # Parent nodes
child = c("A", "B", "C", "D", "E", "F", "G"), # Child nodes
s = 0.5,
sh = c(rep(-1, 4), rep(-.5, 2), -.2),
typeDep = "MN")
## Create fitness specifications from DAG of restrictions considering modules
COADREAD_fitness <- allFitnessEffects(
COADREAD_rT,
geneToModule = c(
"Root" = "Root",
"A" = "APC",
"B" = "TP53",
"C" = "KRAS",
"D" = "PIK3CA, NRAS",
"E" = "FBXW7, ARID1A",
"F" = "ATM, SMAD2",
"G" = "SOX9, SMAD4")) # Modules
## DAG of restrictions representation
plot(COADREAD_fitness, expandModules = TRUE, autofit = TRUE)
## Evaluation of all possible genotypes fitness
## under the previous fitness specifications
COADREAD_FL <- evalAllGenotypes(COADREAD_fitness, max = 131072)
## Fitness landscape representation
plotFitnessLandscape(COADREAD_FL)
## Restriction table, including five-modules DAG of
## restrictions specifications and associated fitness
COADREAD_rT_5d <- data.frame(parent = c("Root", "A", "B", "C"), # Parent nodes
child = c("A", "B", "C", "D"), # Child nodes
s = 0.5,
sh = c(rep(-1, 4)),
typeDep = "MN")
## Create fitness specifications from simplified DAG of restrictions
COADREAD_fitness_5d <- allFitnessEffects(COADREAD_rT_5d,
geneToModule = c( "Root" = "Root",
"A" = "APC",
"B" = "TP53",
"C" = "KRAS",
"D" = "PIK3CA, NRAS"),
drvNames = c("APC", "TP53", "KRAS",
"PIK3CA", "NRAS"))
## Simplified DAG of restrictions representation
plot(COADREAD_fitness_5d, expandModules = TRUE, autofit = TRUE)
## Evaluation of all possible genotypes fitness under the previous fitness specifications
(COADREAD_FL_5d <- evalAllGenotypes(COADREAD_fitness_5d))
| Genotype | Fitness |
|---|---|
| <chr> | <dbl> |
| APC | 1.5000 |
| KRAS | 0.0000 |
| NRAS | 0.0000 |
| PIK3CA | 0.0000 |
| TP53 | 0.0000 |
| APC, KRAS | 0.0000 |
| APC, NRAS | 0.0000 |
| APC, PIK3CA | 0.0000 |
| APC, TP53 | 2.2500 |
| KRAS, NRAS | 0.0000 |
| KRAS, PIK3CA | 0.0000 |
| KRAS, TP53 | 0.0000 |
| NRAS, PIK3CA | 0.0000 |
| NRAS, TP53 | 0.0000 |
| PIK3CA, TP53 | 0.0000 |
| APC, KRAS, NRAS | 0.0000 |
| APC, KRAS, PIK3CA | 0.0000 |
| APC, KRAS, TP53 | 3.3750 |
| APC, NRAS, PIK3CA | 0.0000 |
| APC, NRAS, TP53 | 0.0000 |
| APC, PIK3CA, TP53 | 0.0000 |
| KRAS, NRAS, PIK3CA | 0.0000 |
| KRAS, NRAS, TP53 | 0.0000 |
| KRAS, PIK3CA, TP53 | 0.0000 |
| NRAS, PIK3CA, TP53 | 0.0000 |
| APC, KRAS, NRAS, PIK3CA | 0.0000 |
| APC, KRAS, NRAS, TP53 | 5.0625 |
| APC, KRAS, PIK3CA, TP53 | 5.0625 |
| APC, NRAS, PIK3CA, TP53 | 0.0000 |
| KRAS, NRAS, PIK3CA, TP53 | 0.0000 |
| APC, KRAS, NRAS, PIK3CA, TP53 | 5.0625 |
## Fitness landscape representation
plotFitnessLandscape(COADREAD_FL_5d, use_ggrepel = TRUE)
Warning message: “ggrepel: 21 unlabeled data points (too many overlaps). Consider increasing max.overlaps”
Clones that coexist in a tumor can influence the fitness of each other in a frequency-dependent manner.

## Mapping of genotypes to frequency-dependent fitness
# Not explicitly mapped genotypes are assigned a fitness of cero
COADREAD_gen_freqdep <- data.frame(
Genotype = c("WT", "APC","APC, TP53",
"APC, TP53, KRAS",
"APC, TP53, KRAS, PIK3CA",
"APC, TP53, KRAS, NRAS",
"APC, TP53, KRAS, PIK3CA, NRAS"),
Fitness = c("1", "1.5",
"2.25", "3.375", "5.0625", "5.0625",
"5.0625 - ((f_APC_TP53_KRAS_PIK3CA + f_APC_TP53_KRAS_NRAS))/2"),
stringsAsFactors = FALSE)
## Fitness specifications
COADREAD_fitness_freqdep <- allFitnessEffects(genotFitness = COADREAD_gen_freqdep,
frequencyDependentFitness = TRUE,
frequencyType = "rel")
## Evaluate all genotypes considering population sizes of the clones
(COADREAD_FL_freqdep <- evalAllGenotypes(COADREAD_fitness_freqdep,
spPopSizes = c("WT" = 5, "APC" = 5, "APC, TP53" = 5,
"APC, TP53, KRAS" = 10,
"APC, TP53, KRAS, PIK3CA" = 50,
"APC, TP53, KRAS, NRAS" = 50,
"APC, TP53, KRAS, PIK3CA, NRAS" = 50)))
| Genotype | Fitness |
|---|---|
| <chr> | <dbl> |
| WT | 1.000000 |
| APC | 1.500000 |
| KRAS | 0.000000 |
| NRAS | 0.000000 |
| PIK3CA | 0.000000 |
| TP53 | 0.000000 |
| APC, KRAS | 0.000000 |
| APC, NRAS | 0.000000 |
| APC, PIK3CA | 0.000000 |
| APC, TP53 | 2.250000 |
| KRAS, NRAS | 0.000000 |
| KRAS, PIK3CA | 0.000000 |
| KRAS, TP53 | 0.000000 |
| NRAS, PIK3CA | 0.000000 |
| NRAS, TP53 | 0.000000 |
| PIK3CA, TP53 | 0.000000 |
| APC, KRAS, NRAS | 0.000000 |
| APC, KRAS, PIK3CA | 0.000000 |
| APC, KRAS, TP53 | 3.375000 |
| APC, NRAS, PIK3CA | 0.000000 |
| APC, NRAS, TP53 | 0.000000 |
| APC, PIK3CA, TP53 | 0.000000 |
| KRAS, NRAS, PIK3CA | 0.000000 |
| KRAS, NRAS, TP53 | 0.000000 |
| KRAS, PIK3CA, TP53 | 0.000000 |
| NRAS, PIK3CA, TP53 | 0.000000 |
| APC, KRAS, NRAS, PIK3CA | 0.000000 |
| APC, KRAS, NRAS, TP53 | 5.062500 |
| APC, KRAS, PIK3CA, TP53 | 5.062500 |
| APC, NRAS, PIK3CA, TP53 | 0.000000 |
| KRAS, NRAS, PIK3CA, TP53 | 0.000000 |
| APC, KRAS, NRAS, PIK3CA, TP53 | 4.776786 |
## Fitness landscape representation
plotFitnessLandscape(COADREAD_FL_freqdep)
Fitness landscapes include all possible paths of tumor progression and enable the study of different outcomes depending on the selected evolutionary model
We have mapped various cancer progression models (mutual exclusivity assumptions) to different evolutionary models and analyzed the resulting fitness landscapes
All this evidence supports the idea that cancer progression is better explained through an evolutionary perspective
Current cancer progression models cannot capture these phenomena and, thus, conclusions derived from them should be questioned
OncoSimulR is a very complete and convenient clone-based genetic simulator to evaluate cancer progression under evolutionary models, but is limited to:
